AITopics | binocular vision

CNN^{2}: Viewpoint Generalization via a Binocular Vision

Neural Information Processing Systems, Dec-26-2025, 01:25:29 GMT

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind human visual capabilities. Although recent efforts, such as Capsule Networks, have been made to address this issue, these new models are either hard to train or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via binocular vision. We propose CNN^{2}, a CNN that takes two images as input, which resembles the process of an object being viewed from the left eye and the right eye. CNN^{2} uses novel augmentation, pooling, and convolutional layers to learn a sense of three-dimensionality in a recursive manner. Empirical evaluation shows that CNN^{2} has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN^{2} is easy to implement and train, and is compatible with existing CNN-based specialized techniques for different applications.

name change, viewpoint generalizability, viewpoint generalization, (5 more...)

Neural Information Processing Systems

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Why do horses have eyes on the side of their head?

'You often have to teach horses something on both sides of their body for them to process the information fully.' In the animal kingdom, horses are prey. Have you ever noticed that horses have eyes on the sides of their heads rather than the front, as humans do? The location of horses' eyes offers a biological advantage that helps keep them safe as prey animals.

laura baisa, night vision, whitaker, (12 more...)

Popular Science

Country:

North America > United States > Texas (0.05)
North America > United States > New Jersey (0.05)
North America > United States > California > Yolo County > Davis (0.05)

Industry:

Media > Photography (0.72)
Health & Medicine (0.70)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.36)

Reviews: CNN^{2}: Viewpoint Generalization via a Binocular Vision

Neural Information Processing Systems, Jan-27-2025, 11:47:36 GMT

Originality: To my knowledge, the motivation for such a dual-pathway design is not new. But the particular design of this paper, CM pooling in particular, is definitely novel. Quality: I think the evaluation of this work is quite thorough, but it is missing some important items. 1. Using CM pooling in vanilla CNNs is not shown in the paper. This makes it less clear whether this pooling actually improves the performance of vanilla CNNs. 2. Vanilla CNN tuning details are missing.

artificial intelligence, binocular vision, viewpoint generalization, (2 more...)

Neural Information Processing Systems

Industry:

Media > Television (0.64)
Leisure & Entertainment (0.64)

Technology: Information Technology > Artificial Intelligence (0.43)

Low-cost Stereovision system (disparity map) for few dollars

Ildar, R., Pomazov, E.

arXiv.org Artificial Intelligence, Jun-1-2021

The paper presents an analysis of the latest developments in low-cost stereo vision, both for prototypes and for industrial designs. We describe the theory of stereo vision and present information about cameras, data transfer protocols, and their compatibility with various devices. We review the image-processing theory behind stereo vision and describe the calibration process in detail. We then present the stereo vision system we developed and the main points to consider when developing such systems. Finally, we present software, written in Python for the Windows operating system, for adjusting stereo vision parameters in real time.
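
As a rough illustration of what a disparity map encodes, a rectified stereo pair satisfies the standard pinhole relation Z = f * B / d, where f is the focal length in pixels, B the camera baseline, and d the disparity in pixels. The sketch below is not taken from the paper: the function name `depth_from_disparity` and the rig values (700 px focal length, 6 cm baseline) are hypothetical and chosen only for the example.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (hypothetical value in the demo)
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal pixel shift of a point between left and right
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; a negative
        # value in a rectified pair indicates a failed match.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# One row of an assumed disparity map, in pixels.
disparity_row = [70.0, 35.0, 14.0, 0.0]

# Assumed rig: 700 px focal length, 0.06 m baseline.
depth_row = [depth_from_disparity(700.0, 0.06, d) for d in disparity_row]

for d, z in zip(disparity_row, depth_row):
    print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

Larger disparities map to nearer points, which is why the calibration step matters: errors in f or B scale every depth estimate by the same factor.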

disparity map, distortion, stereo vision, (13 more...)

arXiv.org Artificial Intelligence

2106.00905

Country:

North America > United States > California > Napa County > Napa (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Industry: Media (0.47)

Technology: Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)

Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders

Wilmot, Charles, Shi, Bertram E., Triesch, Jochen

arXiv.org Artificial Intelligence, Jan-27-2021

We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements. The model follows the principle of Active Efficient Coding (AEC), a recent extension of the classic Efficient Coding Hypothesis to active perception. In contrast to previous AEC models, the present model uses deep autoencoders to learn sensory representations. We also propose a new formulation of the intrinsic motivation signal that guides the learning of behavior. We demonstrate the performance of the model in simulations.

active efficient coding, open publication active efficient coding, reconstruction error, (13 more...)

arXiv.org Artificial Intelligence

2101.11391

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

CNN^{2}: Viewpoint Generalization via a Binocular Vision

Chen, Wei-Da, Wu, Shan-Hung (Brandon)

Neural Information Processing Systems, Mar-18-2020, 21:15:39 GMT

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind human visual capabilities. Although recent efforts, such as Capsule Networks, have been made to address this issue, these new models are either hard to train or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via binocular vision. We propose CNN^{2}, a CNN that takes two images as input, which resembles the process of an object being viewed from the left eye and the right eye.

cnn, viewpoint generalizability, viewpoint generalization, (3 more...)

Neural Information Processing Systems

Industry: